[ML-42739] Add custom forecasting data splits for automl_runtime #145
Conversation
while result[-1] <= max(df["ds"]) - horizon_dateoffset:
    cutoff += period_dateoffset
    if not (((df["ds"] > cutoff) & (df["ds"] <= cutoff + horizon_dateoffset)).any()):
        if cutoff < df["ds"].max():
            closest_date = df[df["ds"] > cutoff].min()["ds"]
            cutoff = closest_date - horizon_dateoffset
    result.append(cutoff)
Can you add comments like those in L173-182? They would be very helpful for review and when we look back in the future. E.g. is the cutoff in `result` guaranteed to be an existing data point in `df` or not?
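One way to answer that question empirically is to compare the generated cutoffs against the timestamps in the data. The snippet below is a minimal sketch, not code from this PR: `cutoffs_missing_from_data` is a hypothetical helper, and the sample dates are made up for illustration. It only assumes that `df` has a datetime column named `ds`, as in the diff above.

```python
import pandas as pd


def cutoffs_missing_from_data(df, cutoffs):
    """Return the cutoffs that do not coincide with any timestamp in df["ds"].

    Hypothetical helper for illustration; not part of automl_runtime.
    """
    cutoffs = pd.Series(pd.to_datetime(cutoffs))
    return cutoffs[~cutoffs.isin(df["ds"])].tolist()


# Made-up data with a gap between 2020-01-03 and 2020-01-20.
df = pd.DataFrame(
    {"ds": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-20"])}
)
# Made-up cutoffs: the second is "closest_date - horizon" for a 3-day horizon.
cutoffs = [pd.Timestamp("2020-01-02"), pd.Timestamp("2020-01-17")]
missing = cutoffs_missing_from_data(df, cutoffs)
```

In this made-up case `missing` contains 2020-01-17, which suggests that a cutoff computed as `closest_date - horizon_dateoffset` need not be an existing data point.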
Also, I'm not completely sure that simply switching -= to += is correct. I'll take a deeper look later.
Comments added.
Can you add more details in the PR description?
max_cutoff = max(df["ds"]) - horizon_dateoffset
while cutoff <= max_cutoff:
    # If data does not exist in data range (cutoff, cutoff + horizon_dateoffset]
    if (not (((df["ds"] > cutoff) & (df["ds"] <= cutoff + horizon_dateoffset)).any())) and (cutoff < df["ds"].max()):
Can we remove this redundant condition, as we discussed: `cutoff < df["ds"].max()`?
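To make the redundancy argument concrete, here is a minimal sketch of a forward-moving cutoff loop. It is an assumption about the overall shape, not the PR's actual implementation: the function name `forward_cutoffs`, its parameters, and the sample data are hypothetical, and the loop order may differ from the real code. The point it illustrates is that inside `while cutoff <= max_cutoff`, with a positive horizon, `cutoff` is already strictly less than `df["ds"].max()`, so the extra check cannot change the outcome.

```python
import pandas as pd


def forward_cutoffs(df, first_cutoff, horizon_dateoffset, period_dateoffset):
    """Sketch of forward cutoff generation (hypothetical, for illustration only).

    Assumes df["ds"] is a datetime column and both offsets are positive.
    """
    cutoff = first_cutoff
    max_cutoff = df["ds"].max() - horizon_dateoffset
    result = []
    while cutoff <= max_cutoff:
        # Loop condition plus a positive horizon implies this already holds,
        # which is why the explicit check in the diff is redundant.
        assert cutoff < df["ds"].max()
        in_window = (df["ds"] > cutoff) & (df["ds"] <= cutoff + horizon_dateoffset)
        if not in_window.any():
            # Gap in the data: move the cutoff so that its horizon window
            # ends on the next available observation.
            closest_date = df.loc[df["ds"] > cutoff, "ds"].min()
            cutoff = closest_date - horizon_dateoffset
        result.append(cutoff)
        cutoff += period_dateoffset
    return result


# Made-up data: five daily points, a gap, then six more daily points.
dates = list(pd.date_range("2020-01-01", "2020-01-05")) + list(
    pd.date_range("2020-01-20", "2020-01-25")
)
df = pd.DataFrame({"ds": dates})
cutoffs = forward_cutoffs(
    df,
    first_cutoff=pd.Timestamp("2020-01-02"),
    horizon_dateoffset=pd.DateOffset(days=3),
    period_dateoffset=pd.DateOffset(days=2),
)
```

With this made-up data the loop steps over the gap by jumping to 2020-01-17 (three days before the next observation on 2020-01-20), and the internal assertion never fires.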
This PR enables custom forecasting data splits. There are three main changes.